<h3>Chapter 5: Advanced String Handling</h3>
<p><strong>5.1 What a String Is</strong></p>
<p>
The LPC Basics textbook taught strings as simple data types, and
LPC generally deals with strings in such a manner.
The underlying driver program, however, is written in C, which has no string
data type. The driver in fact sees strings as a complex data type made up
of an array of characters, a simple C data type. LPC, on the other hand,
does not recognize a character data type (there may actually be a driver
or two out there which do recognize the character as a data type, but in
general they do not). The net effect is that there are some array-like things
you can do with strings that you cannot do with other LPC data types.
</p>
<p>
The first efun regarding strings you should learn is the strlen()
efun. This efun returns the length in characters of an LPC string, and is
thus the string equivalent to sizeof() for arrays. Just from the behaviour
of this efun, you can see that the driver treats a string as if it were
made up of smaller elements. In this chapter, you will learn how to deal
with strings on a more basic level, as characters and substrings.
</p>
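<p>
As a small illustration of the parallel between strlen() and sizeof(), the
following sketch measures one string and one array. The function name
show_lengths() is made up for this example, and the sketch assumes your
driver provides the write() efun and allows an int to be added to a string.
</p>
<pre>
void show_lengths() {
    string str;
    int *nums;

    str = "sword";
    nums = ({ 1, 2, 3 });

    /* strlen() counts the characters in the string: 5 for "sword" */
    write("strlen: " + strlen(str) + "\n");

    /* sizeof() counts the elements in the array: 3 here */
    write("sizeof: " + sizeof(nums) + "\n");
}
</pre>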
<p>
<hr size="1">
<p><strong>5.2 Strings as Character Arrays</strong></p>
<p>You can do nearly anything with strings that you can do with arrays,
except assign values on a character basis. At the most basic, you can actually
refer to character constants by enclosing them in '' (single quotes). 'a' and
"a" are therefore very different things in LPC. 'a' represents a character
which cannot be used in assignment statements or any other operations except
comparison evaluations. "a" on the other hand is a string made up of a single
character. You can add and subtract other strings to it and assign it as a
value to a variable.
</p>
<p>
With string variables, you can access the individual characters
to run comparisons against character constants using exactly the same syntax
that is used with arrays. In other words, the statement:
<blockquote>
if( str[2] == 'a')
</blockquote>
is a valid LPC statement comparing the third character in the str string
(indexing starts at 0, so str[2] is the third character) to the character
'a'. Be very careful that you are not comparing elements of arrays to
characters, nor characters of strings to strings.
</p>
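<p>
Character comparison is most useful inside a loop. The sketch below counts
how often the character 'a' occurs in a string, using only indexing and
comparison against a character constant. The function name count_a() is
invented for this example.
</p>
<pre>
int count_a( string str ) {
    int i, n;

    if( !str )
        return 0;

    n = 0;
    /* Walk from the last valid index (strlen - 1) down to index 0 */
    for( i = strlen(str) - 1; i >= 0; i-- ) {
        if( str[i] == 'a' )
            n++;
    }
    return n;
}
</pre>
<p>
Note that the loop never uses an index beyond strlen(str) - 1, the last
valid index of the string.
</p>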
<p>
LPC also allows you to access several characters together using
LPC's range operator .. :
<blockquote>
if( str[0..1] == "ab")
</blockquote>
</p>
<p>
In other words, you can look for the string which is formed
by the characters 0 through 1 in the string str. As with arrays, you must
be careful when using indexing or range operators so that you do not try
to reference an index number larger than the last index. Doing so will result
in an error.
</p>
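<p>
One way to stay inside the string when using the range operator is to check
the length first. This is only a sketch; the function name begins_with_ab()
is invented for the example.
</p>
<pre>
int begins_with_ab( string str ) {
    if( !str )
        return 0;

    /* Only ask for characters 0 through 1 if the string has them */
    if( strlen(str) >= 2 )
        return str[0..1] == "ab";

    return 0;
}
</pre>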
<p>
Now you can see a couple of similarities between strings and arrays:
<blockquote>
1) You may index on both to access the values of individual elements.
<blockquote>
a) The individual elements of strings are characters<br>
b) The individual elements of arrays match the data type of the array.
</blockquote>
2) You may operate on a range of values
<blockquote>
a) Ex: "abcdef"[1..3] is the string "bcd"<br>
b) Ex: ({ 1, 2, 3, 4, 5 })[1..3] is the int array ({ 2, 3, 4 })
</blockquote>
</blockquote>
</p>
<p>
And of course, you should always keep in mind the fundamental
difference: a string is not made up of a more fundamental LPC data type.
In other words, you may not act on the individual characters by assigning
them values.
</p>
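<p>
Since individual characters cannot be assigned, one workaround is to build a
new string out of ranges of the old one. The following sketch swaps the first
character of a string for another string. The function name replace_first()
is invented for this example, and the length checks are there so that the
range operator is never asked for an index past the end of the string.
</p>
<pre>
string replace_first( string str, string repl ) {
    int len;

    if( !str )
        return "";

    len = strlen(str);
    if( len == 0 )
        return "";       /* nothing to replace */
    if( len == 1 )
        return repl;     /* the whole string was its first character */

    /* New first part, followed by characters 1 through the last index */
    return repl + str[1..len-1];
}
</pre>
<p>
With this sketch, replace_first("sword", "S") builds the string "Sword"
without ever assigning to a single character.
</p>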
<hr size="1">
<p><strong>5.3 The Efun sscanf()</strong></p>
<p>
You cannot do any decent string handling in LPC without using sscanf().
Without it, you are left trying to play with the full strings passed by
command statements to the command functions. In other words, you could
not handle a command like "give sword to leo", since you would have no
way of separating "sword to leo" into its constituent parts. Commands such
as this one therefore use sscanf() to accept multiple arguments or to read
more like English.
</p>
<p>
Most people find the manual entries for sscanf() to be rather difficult
reading. The function does not lend itself well to the format used by
manual entries. As I said above, the function is used to take a string
and break it into usable parts. Technically it is supposed to take a string
and scan it into one or more variables of varying types. Take the example
above:
<pre>
int give( string str ) {
    string what, whom;

    if( !str )
        return notify_fail("Give what to whom?\n");

    if( sscanf( str, "%s to %s", what, whom ) != 2 )
        return notify_fail("Give what to whom?\n");

    ... rest of give code ...

}
</pre>
</p>
<p>
The efun sscanf() takes three or more arguments. The first
argument is the string you want scanned. The second argument is called a
control string. The control string is a model which demonstrates in what
form the original string is written, and how it should be divided up. The
rest of the arguments are variables to which you will assign values based
upon the control string.
</p>
<p>
The control string is made up of three different types of elements: 1)
constants, 2) variable arguments to be scanned, and 3) variable arguments
to be discarded. You must have as many of the variable arguments in sscanf()
as you have elements of type 2 in your control string. In the above example,
the control string was "%s to %s", which is a three element control string
made up of one constant part (" to "), and two variable arguments to be
scanned ("%s"). There were no variables to be discarded.
</p>
<p>
The control string basically indicates that the function should find
the string " to " in the string str. Whatever comes before that constant
will be placed into the first variable argument as a string. The same
thing will happen to whatever comes after the constant.
</p>
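<p>
To see what ends up in each variable, consider this small sketch, which scans
a literal string instead of a command argument. The function name
show_give_parse() is invented here, and write() is assumed to be available
on your driver.
</p>
<pre>
void show_give_parse() {
    string what, whom;

    if( sscanf( "sword to leo", "%s to %s", what, whom ) == 2 ) {
        /* The text before " to " lands in what, the text after it in whom */
        write("what: " + what + "\n");    /* sword */
        write("whom: " + whom + "\n");    /* leo */
    }
}
</pre>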
<p>
Variable elements are noted by a "%" sign followed by a code for decoding
them. If the variable element is to be discarded, the "%" sign is followed
by the "*" as well as the code for decoding the variable. Common codes
for variable element decoding are "s" for strings and "d" for integers.
In addition, your driver may support other conversion codes, such as "f"
for float. So in the example above, each "%s" in the control string
indicates that whatever lies in the original string in the corresponding
place will be scanned into a variable as a string.
</p>
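<p>
A short sketch of the "d" code and of a discarded element follows. The
function name show_codes() and the sample strings are invented for
illustration.
</p>
<pre>
void show_codes() {
    string item, seller;
    int count;

    /* "%d" scans an integer, "%s" scans the rest as a string */
    sscanf( "3 apples", "%d %s", count, item );
    /* count is 3, item is "apples" */

    /* "%*s" matches a string but stores it nowhere, so only two
       variables follow the control string */
    sscanf( "3 apples from leo", "%d %*s from %s", count, seller );
    /* count is 3, seller is "leo"; "apples" is discarded */
}
</pre>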
<p>
A simple exercise. How would you turn the string "145" into an integer?
</p>
<p>Answer:</p>
<blockquote>
int x;<br>
<br>
sscanf("145", "%d", x );
</blockquote>
<p>
After the sscanf() function, x will equal the integer 145.
</p>
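<p>
In real code the string usually comes from a player, so it may not contain a
number at all. A hedged sketch of a safer conversion checks the value
returned by sscanf() before trusting x. The function name scan_int() and its
fallback argument are invented for this example.
</p>
<pre>
int scan_int( string str, int fallback ) {
    int x;

    if( !str )
        return fallback;

    /* sscanf() returns 1 only if an integer was scanned into x */
    if( sscanf( str, "%d", x ) != 1 )
        return fallback;

    return x;
}
</pre>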
<p>
Whenever you scan a string against a control string, the function
searches the original string for the first instance of the first constant
in the control string. For example, if your string is "magic attack 100"
and you have the following:
<pre>
int improve(string str) {
    string skill;
    int x;

    if( sscanf( str, "%s %d", skill, x ) != 2 )
        return 0;

    ...

}
</pre>
you would find that sscanf() does not return the value the code expects
(more on the return values later). The control string,
"%s %d", is made up of two variables to be scanned and one constant. The
constant is " ". So the function searches the original string for the first
instance of " ", placing whatever comes before the " " into skill, and trying
to place whatever comes after the " " into x. This separates "magic attack
100" into the components "magic" and "attack 100". The function, however,
cannot make heads or tails of "attack 100" as an integer, so it returns
1, meaning that 1 variable value was successfully scanned ("magic" into
skill).
</p>
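<p>
One possible way around this problem, offered only as a sketch and not as the
standard solution, is to find the last space yourself using the indexing from
section 5.2, and then scan only the text after it. The write() call stands in
for whatever your real improve code would do, and is an assumption of this
example.
</p>
<pre>
int improve( string str ) {
    string skill;
    int x, i, last;

    if( !str )
        return 0;

    /* Walk backwards from the last index until a space is found */
    last = strlen(str) - 1;
    i = last;
    while( i > 0 ) {
        if( str[i] == ' ' )
            break;
        i--;
    }

    /* No space found, or the space is the very last character */
    if( i == 0 || i == last )
        return 0;

    /* Scan only what follows the last space as an integer */
    if( sscanf( str[i+1..last], "%d", x ) != 1 )
        return 0;

    /* Everything before the last space is the skill name */
    skill = str[0..i-1];
    write("Improving " + skill + " by " + x + ".\n");
    return 1;
}
</pre>
<p>
For "magic attack 100", the last space is at index 12, so skill becomes
"magic attack" and x becomes 100.
</p>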
<p>
Perhaps you guessed from the above examples, but the efun sscanf()
returns an int, which is the number of variables into which values from
the original string were successfully scanned. Some examples with return
values for you to examine:
<blockquote>
sscanf("swo rd descartes", "%s to %s", str1, str2) return: 0<br>
sscanf("swo rd descartes", "%s %s", str1, str2) return: 2<br>
sscanf("200 gold to descartes", "%d %s to %s", x, str1, str2) return: 3<br>
sscanf("200 gold to descartes", "%d %*s to %s", x, str1) return: 2
</blockquote>
where x is an int and str1 and str2 are strings.
</p>
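<p>
The return value also lets a single command accept more than one form. The
sketch below tries the longer control string first and falls back to the
shorter one. It extends the give() example only for illustration; the write()
calls are placeholders for real give code.
</p>
<pre>
int give( string str ) {
    string what, whom;
    int amount;

    if( !str )
        return notify_fail("Give what to whom?\n");

    /* First try the form "200 gold to descartes" */
    if( sscanf( str, "%d %s to %s", amount, what, whom ) == 3 ) {
        write("You give " + amount + " " + what + " to " + whom + ".\n");
        return 1;
    }

    /* Otherwise try the form "sword to leo" */
    if( sscanf( str, "%s to %s", what, whom ) == 2 ) {
        write("You give " + what + " to " + whom + ".\n");
        return 1;
    }

    return notify_fail("Give what to whom?\n");
}
</pre>
<p>
Be aware that an item name which itself begins with a number would be caught
by the first form, so a real command would need to decide how to treat it.
</p>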
<hr size="1">
<p><strong>5.4 Summary</strong></p>
<p>
LPC strings can be thought of as arrays of characters, as long as you
keep in mind that LPC does not have a character data type
(in most, but not all, drivers). Since the character is not a true LPC
data type, you cannot act upon individual characters in an LPC string
in the same manner you would act upon different data types. Noticing the
intimate relationship between strings and arrays nevertheless makes it
easier to understand such concepts as the range operator and indexing
on strings.
</p>
<p>
There are efuns other than sscanf() which involve advanced string handling,
but they are not needed nearly as often. You should check on your
mud for man or help files on the efuns explode(), implode(), replace_string(),
and sprintf(). All of these are very valuable tools, especially if you intend
to do coding at the mudlib level.
</p>
